


Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation

Neural Information Processing Systems

Existing 3D human pose estimation models suffer a performance drop when applied to new scenarios with unseen poses, owing to their limited generalizability. In this work, we propose a novel framework, Inference Stage Optimization (ISO), for improving the generalizability of 3D pose models when source and target data come from different pose distributions. Our main insight is that the target data, even though unlabeled, carry valuable priors about their underlying distribution. To exploit this information, ISO performs geometry-aware self-supervised learning (SSL) on each single target instance and updates the 3D pose model before making a prediction. In this way, the model can mine distributional knowledge about the target scenario and quickly adapt to it, improving generalization performance. In addition, to handle sequential target data, we propose an online mode that implements the ISO framework by streaming the SSL, which substantially enhances its effectiveness. We systematically analyze why and how the ISO framework works on diverse benchmarks under a cross-scenario setup. Remarkably, it yields a new state of the art of 83.6% 3D PCK on MPI-INF-3DHP, improving upon the previous best result by 9.7%.
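
The adapt-then-predict loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the one-parameter model, the label-free "unit length" loss, and the names `iso_inference`, `ssl_grad`, and `predict` are all hypothetical stand-ins chosen so the example runs with the standard library only. It assumes that the per-instance mode restarts from the source-trained parameters for every target instance, while the online mode carries updates forward along the stream.

```python
from typing import Callable, List, Sequence

Params = List[float]


def sgd_step(params: Params, grads: Params, lr: float) -> Params:
    """One plain gradient-descent update."""
    return [p - lr * g for p, g in zip(params, grads)]


def iso_inference(
    source_params: Params,
    targets: Sequence[float],
    ssl_grad: Callable[[Params, float], Params],
    predict: Callable[[Params, float], float],
    lr: float = 0.5,
    steps: int = 1,
    online: bool = False,
) -> List[float]:
    """Adapt on each unlabeled target instance with an SSL loss, then predict.

    online=False: every instance starts from the source-trained parameters.
    online=True:  updates are kept and carried over to the next instance.
    """
    params = list(source_params)
    outputs = []
    for x in targets:
        if not online:
            params = list(source_params)  # reset before each instance
        for _ in range(steps):
            params = sgd_step(params, ssl_grad(params, x), lr)
        outputs.append(predict(params, x))  # predict only after adapting
    return outputs


# Hypothetical toy model: prediction = w * x.
def toy_predict(params: Params, x: float) -> float:
    return params[0] * x


# Hypothetical label-free loss 0.5 * (w * x - 1)^2, standing in for a
# geometry-aware SSL objective; this returns its gradient w.r.t. w.
def toy_ssl_grad(params: Params, x: float) -> Params:
    return [(params[0] * x - 1.0) * x]


stream = [1.0, 1.0, 1.0]
per_instance = iso_inference([0.0], stream, toy_ssl_grad, toy_predict)
streamed = iso_inference([0.0], stream, toy_ssl_grad, toy_predict, online=True)
print(per_instance)  # [0.5, 0.5, 0.5]
print(streamed)      # [0.5, 0.75, 0.875]
```

In this toy run the online variant keeps improving across the stream because each instance starts from the previously adapted parameters, which is the intuition behind streaming the SSL over sequential target data.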


Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation (Supplementary Material)

Neural Information Processing Systems

We compute the limb length ratios of upper to lower arm and leg (for both the left and right sides), as well as the torso, for geometric distribution analysis. The joints and body parts of interest are defined in Fig. S1. All results are reported under the unscaled protocol. How does the choice of self-supervised learning technique impact accuracy? We observe that Adv (in the Joint, Vanilla and Online settings) improves accuracy over the Baseline by a large margin.
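
As a rough illustration of the upper-to-lower limb length ratio mentioned above, the following sketch computes it from 3D joint positions. The joint names (`shoulder`, `elbow`, `wrist`) and the dictionary layout are assumptions for this example; the actual joint definitions are those given in Fig. S1, which is not reproduced here, and the torso measure is omitted because its definition is not stated in this excerpt.

```python
import math
from typing import Dict, Tuple

Joint = Tuple[float, float, float]


def limb_length(pose: Dict[str, Joint], a: str, b: str) -> float:
    """Euclidean distance between two named 3D joints."""
    return math.dist(pose[a], pose[b])


def upper_lower_ratio(pose: Dict[str, Joint], root: str, mid: str, end: str) -> float:
    """Ratio of the upper segment (root-mid) to the lower segment (mid-end)."""
    return limb_length(pose, root, mid) / limb_length(pose, mid, end)


# Hypothetical pose with assumed joint names; units are arbitrary.
pose = {
    "shoulder": (0.0, 0.0, 0.0),
    "elbow": (0.0, 3.0, 0.0),
    "wrist": (0.0, 3.0, 4.0),
}
arm_ratio = upper_lower_ratio(pose, "shoulder", "elbow", "wrist")
print(arm_ratio)  # 0.75
```

The same helper would apply to a leg by passing hip, knee, and ankle joints, again under whatever naming the dataset uses.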



Review for NeurIPS paper: Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation

Neural Information Processing Systems

Weaknesses: Though the authors shed light on why ISO performs well, I still have questions about the Shared Feature Extractor, the SSL Head, and the FSL Head. As the SSL is from existing work and the main contribution is the combination of SSL with FSL, answering these questions clearly is important. What kind of features or information is shared in the Shared Feature Extractor? How far will it drift when trained on new target data, to the point that it causes the FSL head to fail? What information is kept in the FSL head?


Review for NeurIPS paper: Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation

Neural Information Processing Systems

This paper proposes an inference stage optimization to improve 3D human pose estimation. All reviewers recommend acceptance, but the paper would benefit from additional analysis and greater clarity with respect to existing work. Please include the additional references and provide the clarifications requested during reviewing.

